BPS: a Format-Preserving Encryption Proposal

Authors

  • Eric Brier
  • Thomas Peyrin
  • Jacques Stern
Abstract

In recent months, attacks on servers of payment processors have led to the disclosure of tens of millions of credit card numbers (also known as Personal Account Numbers, PANs). As an answer, end-to-end encryption has been advocated, and an encryption standard that preserves the format of the data would be welcome. More generally, a format-preserving encryption scheme would be welcome in many real-life applications. Unfortunately, this request falls in an area that is not yet adequately covered by cryptographic theory: direct constructions [1,20] have not received enough attention to be considered for standardization, and constructions based on Feistel schemes (as proposed by [5,4]) suffer from the lack of tight exact security estimates. Very recently, the use of unbalanced Feistel schemes has been suggested and a precise security bound, based on Markov chains, has been derived [12]. However, the bound comes at the cost of a large number of calls to the underlying cipher. In this paper, we present a generic format-preserving symmetric encryption algorithm, BPS, which can cipher short or long strings of characters from any given set. In particular, this construction offers a tweak capability, which is very useful in practice when the user would like to cipher very small strings of data. We also provide particular instances for the case of PAN ciphering. Very recently, a similar proposal has been independently submitted to the NIST standardization process [3].

Most block ciphers from industry and the academic world handle a binary domain $\{0,1\}^n$, with a block size often equal to n = 64 bits or 128 bits. While those ciphers clearly deal with the most useful cases in practice, what if one wants to design a cipher that maintains another message domain M whose cardinality |M| is arbitrary? For example, such a primitive could be really useful in applications where the data manipulated is composed of digits and not bits, as is the case for credit-card numbers (PANs).
Of course, it is always possible to use a standardized block cipher with a larger binary domain $M' = \{0,1\}^n$ (|M| ≤ |M'|) and to use extra data fields coming with the ciphertext to restore an equivalent format. However, we are looking here for elegant constructions that are not based on any engineering trick and that produce ciphertexts with strictly no expansion. In practice, expansion amounts to breaking the format, which many actors in the communication channel may not support. Several dedicated block ciphers have recently been proposed to answer this challenge for particular situations [20,6,1,9]. Yet, in practice it would be interesting to have a construction that uses already standardized block ciphers or hash functions such as TDES [14], AES [15] or SHA-2 [13] as its internal primitive. In particular, those primitives are the most likely to be available in hardware.

Black and Rogaway [5] provided a theoretical study of this problem. In their article, three potential constructions of ciphers over arbitrary finite domains were proposed. The first method, named prefix cipher, uses as internal primitive a cipher $E'$ with a domain larger than M and defines the permutation $E_K(i)$ by first computing all the |M| ciphertexts $j = E'_K(i)$ of messages i with 0 ≤ i ≤ |M| − 1 and by sorting them according to their value. The ordinal position in the sorted table of the value j corresponding to the query i gives $E_K(i)$. The second method, named cycle-walking cipher, also uses a cipher $E'$ with a domain larger than M. For a plaintext i, one outputs the value $j = E'_K(i)$ if j ∈ M. The out-of-range ciphertexts are simply treated by applying the permutation $E'_K$ again until one reaches the domain M. Finally, the last method, named generalized-Feistel cipher, uses a Feistel construction [7] with some random functions $F_i$ and modular additions. This construction maintains two branches with domains L and R such that |M| ≤ L × R.
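As an illustration of the first two methods, here is a minimal Python sketch. The helper `toy_permutation` is a hypothetical stand-in for the larger-domain cipher $E'_K$: a small Feistel permutation keyed through HMAC-SHA-256, chosen only so that the example runs with the standard library. It is not one of the primitives discussed in [5] and offers no security. The names `prefix_cipher` and `cycle_walk` are likewise chosen for this illustration.

```python
import hashlib
import hmac


def toy_permutation(key: bytes, bits: int):
    """Hypothetical stand-in for E'_K: a 4-round Feistel permutation on
    `bits`-bit integers (`bits` must be even). For illustration only."""
    half = bits // 2
    mask = (1 << half) - 1

    def prf(rnd: int, x: int) -> int:
        msg = bytes([rnd]) + x.to_bytes(8, "big")
        digest = hmac.new(key, msg, hashlib.sha256).digest()
        return int.from_bytes(digest, "big") & mask

    def encrypt(x: int) -> int:
        left, right = x >> half, x & mask
        for rnd in range(4):
            left, right = right, left ^ prf(rnd, right)
        return (left << half) | right

    return encrypt


def prefix_cipher(e_prime, size: int):
    """Prefix cipher: encrypt every message of M = {0, ..., size-1} and rank
    the ciphertexts. table[i] is the rank of E'_K(i), i.e. E_K(i)."""
    order = sorted(range(size), key=e_prime)
    table = [0] * size
    for rank, i in enumerate(order):
        table[i] = rank
    return table


def cycle_walk(e_prime, size: int, i: int) -> int:
    """Cycle-walking cipher: re-apply E'_K until the value falls back in M."""
    j = e_prime(i)
    while j >= size:
        j = e_prime(j)
    return j


# Usage: a domain of 1000 messages inside a 12-bit permutation.
e = toy_permutation(b"demo key", 12)
table = prefix_cipher(e, 1000)
walked = [cycle_walk(e, 1000, i) for i in range(1000)]
```

Both `table` and `walked` are permutations of {0, ..., 999}. The prefix cipher pays its whole cost up front, when the table is built, while the number of iterations in `cycle_walk` depends on the input.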
When |M| < L × R, out-of-range ciphertexts may be reached and the construction is then combined with the cycle-walking cipher (i.e. the permutation is applied again until a valid ciphertext is reached).

The first method is interesting for small values of |M| but is completely impractical otherwise, since 2|M| time and memory are required in order to start using the cipher. The second method is practical but presents a drawback: the duration of a ciphering process is not deterministic. This could be a problem in some applications, even if the potential threat of timing attacks should not be harmful (see [4]). Finally, the last method seems to be the most elegant and promising one, even if the best known security proof so far only achieves a birthday-paradox bound (for the binary case, better proofs are known [17,18]). More precisely, the analysis is an adaptation of the well-known Luby-Rackoff security proof [11], and it shows that when the attacker is limited to fewer than $Q = 2^{\min\{L,R\}/2}$ plaintext/ciphertext pairs, she does not have enough information to distinguish this construction from a random permutation with domain M. This proof holds whatever the computing power of the attacker is. However, for intermediate values of |M|, one can assume that the attacker can indeed obtain Q plaintext/ciphertext pairs in practice. For example, let us consider the encryption of credit-card numbers between two parties. Only about a dozen digits are unpredictable in a credit-card number, thus we consider $M = \{0,\dots,9\}^{12}$ and $|M| = 10^{12}$. In this case, the generalized-Feistel cipher birthday proof [5] ensures security up to 1000 plaintext/ciphertext pairs (each of the two balanced branches holds six digits, and $\sqrt{10^6} = 1000$). Note that a proposal by Spies [21] combining balanced Feistel networks and the cycle-walking technique was submitted to NIST in 2008.

A first improvement would be to design a tweakable block cipher [10,8] instead, as recently published by Bellare et al. [4].
In this case, the designer is assured that many more plaintext/ciphertext pairs are necessary in order to attack the scheme (since these pairs are likely to use different tweak values). In our previous example, the attacker would have to get 1000 plaintext/ciphertext pairs with the same tweak value instead of 1000 random plaintext/ciphertext pairs. In parallel, another route has recently been taken by Morris, Rogaway and Stegers [12], who used highly unbalanced Feistel schemes. Using the theory of Markov chains, the authors were able to derive exact security bounds. Despite their attractive features, these bounds come at the cost of a large number of calls to the underlying cipher due to the many Feistel rounds, which might make them unsuitable in practice. Very recently, a proposal combining the tweak feature [4] and the new security proof techniques [12] has been submitted to NIST [3].

A second improvement would be to increase the expected security of the more conservative approach based on balanced Feistel schemes by improving the proven security bounds. In our case, that would mean going beyond the birthday bound. To achieve this, one would naturally draw inspiration from Patarin's recent work on the security of Feistel networks [16,17,18,19], which also crosses the birthday bound barrier. However, since the domain size can be small, we are aiming here at concrete security instead of asymptotic security, and this task seems quite difficult for the time being. Note however that, contrary to unbalanced Feistel networks, the best bound one can achieve is $O(2^{n/2})$, since it is always possible for a computationally unbounded adversary to distinguish an r-round Feistel cipher manipulating n-bit blocks from a random permutation with $r \times 2^{n/2}$ queries (by simply trying to guess all the r unknown internal functions used). This does not disqualify balanced Feistel networks, since they seem to require far fewer calls to the underlying cipher.

Our contribution.
We propose a simple yet very flexible format-preserving encryption algorithm. Our proposal can cipher short or long strings composed of characters from any set. BPS can use any standardized primitive such as TDES [14], AES [15] or SHA-2 [13] as its internal building block.

1 The Generic BPS Cipher

For the description, we will use the following notations and the little-endian order. Assume that one wants to cipher strings of characters from a set S, with s representing the cardinality of that set: s = |S|. For example, we have S = {0, 1} and s = 2 in the case of bits, or S = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} and s = 10 in the case of digits. The only parameter that matters here is the cardinality of the set of characters S, since one can always define a bijective mapping from S to {0, 1, . . . , s − 1}. Thus, in the following, we will only deal with integers in {0, 1, . . . , s − 1}, each representing a character in S, which we call an s-integer.

We denote by len the length of the string ST to cipher, i.e. len = |ST|, and we denote by ST[i] the i-th s-integer of the string ST, counting from the left, with 0 ≤ i < len. For example, with s = 16, if ST = 0 14 2 11 we have len = 4 and ST[0] = 0, ST[1] = 14, ST[2] = 2, ST[3] = 11. Then, ST || ST' denotes the concatenation of the strings ST and ST', i.e. if ST' = 8 15 6 4 then ST || ST' = 0 14 2 11 8 15 6 4.

BPS makes extensive use of modular addition (resp. subtraction), which we denote by ⊞ (resp. ⊟). When we write C = A ⊞ B (mod x), we consider that A, B, C ∈ ℕ. Of course, we have C ∈ {0, . . . , x − 1}. Moreover, we will use the bitwise exclusive or (XOR) operation, which we denote c = a ⊕ b, where a, b, c are bit-words of the same length.

BPS is built upon two components: an internal length-limited block cipher (which itself uses an internal function such as TDES [14], AES [15] or SHA-2 [13]) and a mode of operation in order to handle long strings.
The next two sections describe these two components in turn.

1.1 The Internal Block Cipher BC

We denote by BC the internal cipher of BPS, distinguishing the encryption and the decryption processes by $BC$ and $BC^{-1}$ respectively. We instantiate the cipher according to the cardinality s of the character set and according to the block length b of the cipher we are building. Thus, $Y = BC_{F,s,b,w}(X, K, T)$ denotes the w-round encryption (w being an even number) of an s-integer string X of length b, with key K and the 64-bit tweak value T. Of course, since we are building format-preserving encryption, the output string Y will also be composed of b s-integers. We denote by f the number of output bits of the internal function F.

We have the natural restriction that at least two characters must be ciphered, i.e. b ≥ 2. Also, the bit length k of the key K is limited according to the internal function F used. If F is an f-bit block cipher that manipulates k'-bit keys, then we require that k = k'. In the case of F being an HMAC construction [2] with an f-bit hash function, one can use a key of arbitrary length. We denote by $F_K(x)$ the application of the block cipher E with the key K on the plaintext x ($F_K(x) = E_K(x)$), or the application of the HMAC construction with the hash function H and the key K on the message x ($F_K(x) = \mathrm{HMAC}[H]_K(x)$).

We first divide the 64-bit tweak T into two 32-bit sub-tweaks $T_L$ and $T_R$, i.e. if T is considered as a 64-bit integer, then $T_R = T \bmod 2^{32}$ and $T_L = (T - T_R)/2^{32}$. Then, we divide the s-integer input string X of length b into two sub-strings $X_L$ and $X_R$ of similar lengths l and r respectively, i.e. $X = X_L || X_R$. More precisely, if b is even, $X_L = X[0] \dots X[l-1]$ and $X_R = X[l] \dots X[l+r-1]$, where l = r = b/2. If b is odd, $X_L = X[0] \dots X[l-1]$ and $X_R = X[l] \dots X[l+r-1]$, where l = (b + 1)/2 and r = (b − 1)/2. The internal state of the cipher is composed of two branches L and R, each of f − 32 bits.
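The two splitting steps just described can be sketched in a few lines of Python. The function names `split_tweak` and `split_block` are hypothetical, chosen for this illustration only, and s-integer strings are modelled as Python lists of integers.

```python
def split_tweak(tweak: int):
    """Split a 64-bit tweak T into the 32-bit sub-tweaks (T_L, T_R)."""
    t_right = tweak % (1 << 32)              # T_R = T mod 2^32
    t_left = (tweak - t_right) // (1 << 32)  # T_L = (T - T_R) / 2^32
    return t_left, t_right


def split_block(x: list):
    """Split an s-integer string X of length b into (X_L, X_R).

    l = b/2 when b is even and (b + 1)/2 when b is odd, so the left
    half receives the extra character of an odd-length string.
    """
    l = (len(x) + 1) // 2
    return x[:l], x[l:]


t_left, t_right = split_tweak(0x0123456789ABCDEF)
x_left, x_right = split_block([0, 14, 2, 11])
```

With these inputs, `t_left` is 0x01234567 and `t_right` is 0x89ABCDEF, and the four-character string splits into `[0, 14]` and `[2, 11]`.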
We impose a last restriction (the explanation is given in later sections): b ≤ maxb, with $maxb = 2 \times \lfloor \log_s(2^{f-32}) \rfloor$. For example, when using AES as internal function, each branch will be represented by a 96-bit integer, and when ciphering digits (s = 10) we would have the restriction b ≤ 56. We give in Table 1 the maximal value maxb for b according to s and the internal function F used.

Table 1. Maximal value maxb for the number b of input s-integers of $BC$ and $BC^{-1}$, according to the character set cardinality s and the internal function used.

            s = 2 (bits)   s = 10 (digits)   s = 61
    TDES    64             18                10
    AES     192            56                32
    SHA-2   448            134               74

The encryption BC is composed of w simple Feistel-like rounds, each of which updates the right or the left branch in turn. We denote by $L_i$ (resp. $R_i$) the left (resp. right) branch value after application of round i. The left and right branches are initialized with $X_L$ and $X_R$ respectively:

$$L_0 = X_L[0] \cdot s^0 + X_L[1] \cdot s^1 + \dots + X_L[l-1] \cdot s^{l-1}$$
$$R_0 = X_R[0] \cdot s^0 + X_R[1] \cdot s^1 + \dots + X_R[r-1] \cdot s^{r-1}$$

When the encryption process BC is instantiated with a block cipher E, for each 0 ≤ i < w we apply the round function (see Figure 1):

$$L_{i+1} = L_i \boxplus E_K\big((T_R \oplus i) \cdot 2^{f-32} + R_i\big) \pmod{s^l}, \qquad R_{i+1} = R_i \qquad \text{if } i \text{ is even}$$
$$L_{i+1} = L_i, \qquad R_{i+1} = R_i \boxplus E_K\big((T_L \oplus i) \cdot 2^{f-32} + L_i\big) \pmod{s^r} \qquad \text{if } i \text{ is odd}$$

Finally, the output string Y is the concatenation of $Y_L$ and $Y_R$, i.e. $Y = Y_L || Y_R$, with $Y_L$ and $Y_R$ built by decomposing $L_w$ and $R_w$ in base s.

Footnote 1. Conceptually, the two branches always maintain the formatting, and thus the left branch manipulates data in $\{0, \dots, s^l - 1\}$ and the right branch manipulates data in $\{0, \dots, s^r - 1\}$. However, those branches are always coded on (f − 32)-bit integers for consistency with the concatenation function.
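To make the round structure concrete, the following Python sketch applies the round formula above with HMAC-SHA-256 (f = 256) in place of $E_K$, so that it runs with the standard library alone. Several points are assumptions of this sketch rather than details taken from the text: the 32-byte big-endian encoding of the input to F, the reading of the HMAC output as an integer, the little-endian decomposition of $L_w$ and $R_w$ (chosen as the inverse of the initialization), the default w = 8, and the decryption path, which is simply the rounds undone in reverse order with modular subtraction. The names `max_b`, `prf` and `bc` are hypothetical.

```python
import hashlib
import hmac

F_BITS = 256           # f for HMAC-SHA-256 (assumption of this sketch)
SHIFT = F_BITS - 32    # each branch is coded on f - 32 bits


def max_b(s: int, f: int) -> int:
    """maxb = 2 * floor(log_s(2^(f-32))), computed with exact integers."""
    limit, count, power = 1 << (f - 32), 0, s
    while power <= limit:
        power *= s
        count += 1
    return 2 * count


def to_int(digits, s):
    # Little-endian: digits[0] is the least significant s-integer.
    return sum(d * s**j for j, d in enumerate(digits))


def to_digits(value, s, length):
    out = []
    for _ in range(length):
        value, d = divmod(value, s)
        out.append(d)
    return out


def prf(key: bytes, x: int) -> int:
    # Stand-in for F_K; the input/output encodings are assumptions.
    digest = hmac.new(key, x.to_bytes(F_BITS // 8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big")


def bc(x, key: bytes, tweak: int, s: int, w: int = 8, decrypt: bool = False):
    b = len(x)
    if not 2 <= b <= max_b(s, F_BITS) or w % 2:
        raise ValueError("need 2 <= b <= maxb and an even round count")
    t_right = tweak % (1 << 32)
    t_left = (tweak - t_right) >> 32
    l = (b + 1) // 2
    r = b - l
    left, right = to_int(x[:l], s), to_int(x[l:], s)
    sign = -1 if decrypt else 1
    rounds = reversed(range(w)) if decrypt else range(w)
    for i in rounds:
        if i % 2 == 0:   # even rounds update the left branch
            left = (left + sign * prf(key, ((t_right ^ i) << SHIFT) + right)) % s**l
        else:            # odd rounds update the right branch
            right = (right + sign * prf(key, ((t_left ^ i) << SHIFT) + left)) % s**r
    return to_digits(left, s, l) + to_digits(right, s, r)


pan = [int(c) for c in "4000123456789010"]
ct = bc(pan, b"0123456789abcdef", 0x0123456789ABCDEF, 10)
pt = bc(ct, b"0123456789abcdef", 0x0123456789ABCDEF, 10, decrypt=True)
```

Here `ct` is again a list of 16 decimal digits and `pt` equals `pan`. The `max_b` helper reproduces the entries of Table 1, for instance `max_b(10, 128)` gives 56 for AES with digits.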


Related resources

A Synopsis of Format-Preserving Encryption

Format-preserving encryption (FPE) encrypts a plaintext of some specified format into a ciphertext of the same format—for example, encrypting a social-security number into a social-security number. In this survey we describe FPE and review known techniques for achieving it. These include FFX, a recent proposal made to NIST.

Breaking the FF3 Format Preserving Encryption

The NIST standard FF3 scheme (also known as BPS scheme) is a tweakable block cipher based on an 8-round Feistel Network. We break it with a practical attack. Our attack exploits the bad domain separation in FF3 design. The attack works with chosen plaintexts and tweaks when the message domain is small. Our FF3 attack requires $O(N^{11/6})$ chosen plaintexts with time complexity N, where N is domain...

Recommendation for Block Cipher Modes of Operation: Methods for Format-Preserving Encryption

This Recommendation specifies three methods for format-preserving encryption, called FF1, FF2, and FF3. Each of these methods is a mode of operation of the AES algorithm, which is used to construct a round function within the Feistel structure for encryption.

VAES3 scheme for FFX

VAES stands for variable AES. VAES3 is the third generation format-preserving encryption algorithm that was developed in a report [4] simultaneously with the comprehensive paper on FPE [1] and subsequently updated slightly to be in concert with the FFX standard proposal. The standard proposal of FFX includes, in an appendix, example instantiations called A2 and A10. A follow on addendum [3] inc...

Notes on Property-Preserving Encryption

The first type of specialized encryption scheme that can be used in secure outsourced storage we will look at is property-preserving encryption. This is encryption where some desired property of the plaintexts is intentionally leaked by the ciphertexts. The two main examples we will study are deterministic encryption, which preserves the equality property, and order preserving encryption, which...


Publication date: 2010